Overview

Dataset statistics

Number of variables: 11
Number of observations: 891
Missing cells: 179
Missing cells (%): 1.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 76.7 KiB
Average record size in memory: 88.1 B
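
The headline figures can be cross-checked by hand. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that the average record size is the total memory divided by the row count; the variable names are illustrative only.

```python
# Cross-check the overview figures reported above.
n_variables = 11
n_observations = 891
missing_cells = 179
total_kib = 76.7  # total size in memory, as reported

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_kib * 1024 / n_observations

print(total_cells)                 # 9801
print(round(missing_pct, 1))       # 1.8
print(round(avg_record_bytes, 1))  # 88.1
```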

Variable types

Numeric: 6
Text: 1
Categorical: 4

Alerts

Cabin is highly overall correlated with Fare and 2 other fields (High correlation)
Fare is highly overall correlated with Cabin and 1 other fields (High correlation)
Fare.1 is highly overall correlated with Cabin and 1 other fields (High correlation)
Pclass is highly overall correlated with Cabin (High correlation)
Age has 177 (19.9%) missing values (Missing)
PassengerId is uniformly distributed (Uniform)
PassengerId has unique values (Unique)
Name has unique values (Unique)
Fare has 15 (1.7%) zeros (Zeros)
SibSp has 608 (68.2%) zeros (Zeros)
Parch has 678 (76.1%) zeros (Zeros)
Fare.1 has 15 (1.7%) zeros (Zeros)
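
Fare.1 carries the same alerts as Fare, and the two share every statistic further down in this report. That would be consistent with the input file containing the Fare column twice, since pandas appends ".1" to a repeated column name when it reads a file. A minimal sketch of how such duplicates could be detected, using only the first five rows from the Sample section; the helper name identical_columns is hypothetical.

```python
from itertools import combinations

# First five rows from the Sample section, restricted to three columns.
rows = [
    {"Fare": 7.25, "Pclass": 3, "Fare.1": 7.25},
    {"Fare": 71.2833, "Pclass": 1, "Fare.1": 71.2833},
    {"Fare": 7.925, "Pclass": 3, "Fare.1": 7.925},
    {"Fare": 53.1, "Pclass": 1, "Fare.1": 53.1},
    {"Fare": 8.05, "Pclass": 3, "Fare.1": 8.05},
]

def identical_columns(rows):
    """Return pairs of column names whose values match in every row."""
    names = list(rows[0])
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if all(r[a] == r[b] for r in rows)
    ]

duplicates = identical_columns(rows)
print(duplicates)  # [('Fare', 'Fare.1')]
```

Dropping one of the two columns before profiling would remove the redundant alerts.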

Reproduction

Analysis started: 2024-01-25 07:32:30.856757
Analysis finished: 2024-01-25 07:32:37.642802
Duration: 6.79 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json
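
A report of this kind can be regenerated along the following lines. This is a sketch, not the configuration actually used here: the CSV file name is hypothetical, the title is a placeholder, and the pandas and ydata-profiling packages are assumed to be installed.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write the HTML report (sketch)."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Profiling Report")
    profile.to_file(html_path)
    return html_path

# Usage (hypothetical file name):
# build_report("titanic_preprocessed.csv")
```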

Variables

PassengerId
Real number (ℝ)

UNIFORM  UNIQUE 

Distinct: 891
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 446
Minimum: 1
Maximum: 891
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 45.5
Q1: 223.5
Median: 446
Q3: 668.5
95-th percentile: 846.5
Maximum: 891
Range: 890
Interquartile range (IQR): 445

Descriptive statistics

Standard deviation: 257.35384
Coefficient of variation (CV): 0.57702655
Kurtosis: -1.2
Mean: 446
Median Absolute Deviation (MAD): 223
Skewness: 0
Sum: 397386
Variance: 66231
Monotonicity: Strictly increasing
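
Because PassengerId is simply every integer from 1 to 891, its statistics can be reproduced with the Python standard library. The sketch assumes the report uses the sample variance (n - 1 denominator) and linearly interpolated quantiles, which the "inclusive" method of statistics.quantiles reproduces for this data.

```python
import statistics

ids = list(range(1, 892))  # every integer from 1 to 891

total = sum(ids)
mean = statistics.mean(ids)
variance = statistics.variance(ids)  # sample variance (n - 1 denominator)
std = statistics.stdev(ids)
cv = std / mean                      # coefficient of variation

# Quartiles with linear interpolation between neighbouring values.
q1, med, q3 = statistics.quantiles(ids, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation around the median.
mad = statistics.median(abs(x - med) for x in ids)

print(total, mean, variance)
print(round(std, 5), round(cv, 8))
print(q1, med, q3, iqr, mad)
```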
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
1                     1       0.1%
599                   1       0.1%
588                   1       0.1%
589                   1       0.1%
590                   1       0.1%
591                   1       0.1%
592                   1       0.1%
593                   1       0.1%
594                   1       0.1%
595                   1       0.1%
Other values (881)    881     98.9%

Value                 Count   Frequency (%)
1                     1       0.1%
2                     1       0.1%
3                     1       0.1%
4                     1       0.1%
5                     1       0.1%
6                     1       0.1%
7                     1       0.1%
8                     1       0.1%
9                     1       0.1%
10                    1       0.1%

Value                 Count   Frequency (%)
891                   1       0.1%
890                   1       0.1%
889                   1       0.1%
888                   1       0.1%
887                   1       0.1%
886                   1       0.1%
885                   1       0.1%
884                   1       0.1%
883                   1       0.1%
882                   1       0.1%

Name
Text

UNIQUE 

Distinct: 891
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Length

Max length: 82
Median length: 52
Mean length: 26.965208
Min length: 12

Characters and Unicode

Total characters: 24026
Distinct characters: 60
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 891
Unique (%): 100.0%

Sample

1st row: Braund, Mr. Owen Harris
2nd row: Cumings, Mrs. John Bradley (Florence Briggs Thayer)
3rd row: Heikkinen, Miss. Laina
4th row: Futrelle, Mrs. Jacques Heath (Lily May Peel)
5th row: Allen, Mr. William Henry

Value                 Count   Frequency (%)
mr                    521     14.4%
miss                  182     5.0%
mrs                   129     3.6%
william               64      1.8%
john                  44      1.2%
master                40      1.1%
henry                 35      1.0%
george                24      0.7%
james                 24      0.7%
charles               23      0.6%
Other values (1515)   2538    70.0%
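
The counts in this word table sum to 3,624, so each percentage is a share of all words rather than of the 891 names (for example, 521 / 3624 ≈ 14.4% for "mr"). A minimal sketch of such a word count over the five sample names; lowercasing and splitting on runs of letters is an assumed tokenisation, not necessarily the one the profiler applies.

```python
import re
from collections import Counter

names = [
    "Braund, Mr. Owen Harris",
    "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
    "Heikkinen, Miss. Laina",
    "Futrelle, Mrs. Jacques Heath (Lily May Peel)",
    "Allen, Mr. William Henry",
]

# Lowercase each name and keep runs of letters as words (assumed tokenisation).
words = Counter(
    word for name in names for word in re.findall(r"[a-z]+", name.lower())
)
total_words = sum(words.values())

for word in ("mr", "mrs", "miss"):
    share = 100 * words[word] / total_words
    print(word, words[word], f"{share:.1f}%")
```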

Most occurring characters

Value                 Count   Frequency (%)
(space)               2735    11.4%
r                     1958    8.1%
e                     1703    7.1%
a                     1657    6.9%
i                     1325    5.5%
n                     1304    5.4%
s                     1297    5.4%
M                     1128    4.7%
l                     1067    4.4%
o                     1008    4.2%
Other values (50)     8844    36.8%

Most occurring categories

Value                 Count   Frequency (%)
Lowercase Letter      15446   64.3%
Uppercase Letter      3645    15.2%
Space Separator       2735    11.4%
Other Punctuation     1899    7.9%
Close Punctuation     144     0.6%
Open Punctuation      144     0.6%
Dash Punctuation      13      0.1%

Most frequent character per category

Lowercase Letter
Value                 Count   Frequency (%)
r                     1958    12.7%
e                     1703    11.0%
a                     1657    10.7%
i                     1325    8.6%
n                     1304    8.4%
s                     1297    8.4%
l                     1067    6.9%
o                     1008    6.5%
t                     667     4.3%
h                     517     3.3%
Other values (16)     2943    19.1%

Uppercase Letter
Value                 Count   Frequency (%)
M                     1128    30.9%
A                     250     6.9%
J                     215     5.9%
H                     203     5.6%
S                     180     4.9%
C                     172     4.7%
E                     166     4.6%
W                     143     3.9%
B                     140     3.8%
L                     129     3.5%
Other values (15)     919     25.2%

Other Punctuation
Value                 Count   Frequency (%)
.                     892     47.0%
,                     891     46.9%
"                     106     5.6%
'                     9       0.5%
/                     1       0.1%

Space Separator
Value                 Count   Frequency (%)
(space)               2735    100.0%

Close Punctuation
Value                 Count   Frequency (%)
)                     144     100.0%

Open Punctuation
Value                 Count   Frequency (%)
(                     144     100.0%

Dash Punctuation
Value                 Count   Frequency (%)
-                     13      100.0%

Most occurring scripts

Value                 Count   Frequency (%)
Latin                 19091   79.5%
Common                4935    20.5%

Most frequent character per script

Latin
Value                 Count   Frequency (%)
r                     1958    10.3%
e                     1703    8.9%
a                     1657    8.7%
i                     1325    6.9%
n                     1304    6.8%
s                     1297    6.8%
M                     1128    5.9%
l                     1067    5.6%
o                     1008    5.3%
t                     667     3.5%
Other values (41)     5977    31.3%

Common
Value                 Count   Frequency (%)
(space)               2735    55.4%
.                     892     18.1%
,                     891     18.1%
)                     144     2.9%
(                     144     2.9%
"                     106     2.1%
-                     13      0.3%
'                     9       0.2%
/                     1       < 0.1%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                 24026   100.0%

Most frequent character per block

ASCII
Value                 Count   Frequency (%)
(space)               2735    11.4%
r                     1958    8.1%
e                     1703    7.1%
a                     1657    6.9%
i                     1325    5.5%
n                     1304    5.4%
s                     1297    5.4%
M                     1128    4.7%
l                     1067    4.4%
o                     1008    4.2%
Other values (50)     8844    36.8%

Fare
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 248
Distinct (%): 27.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.204208
Minimum: 0
Maximum: 512.3292
Zeros: 15
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 7.225
Q1: 7.9104
Median: 14.4542
Q3: 31
95-th percentile: 112.07915
Maximum: 512.3292
Range: 512.3292
Interquartile range (IQR): 23.0896

Descriptive statistics

Standard deviation: 49.693429
Coefficient of variation (CV): 1.5430725
Kurtosis: 33.398141
Mean: 32.204208
Median Absolute Deviation (MAD): 6.9042
Skewness: 4.7873165
Sum: 28693.949
Variance: 2469.4368
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
8.05                  43      4.8%
13                    42      4.7%
7.8958                38      4.3%
7.75                  34      3.8%
26                    31      3.5%
10.5                  24      2.7%
7.925                 18      2.0%
7.775                 16      1.8%
7.2292                15      1.7%
0                     15      1.7%
Other values (238)    615     69.0%

Value                 Count   Frequency (%)
0                     15      1.7%
4.0125                1       0.1%
5                     1       0.1%
6.2375                1       0.1%
6.4375                1       0.1%
6.45                  1       0.1%
6.4958                2       0.2%
6.75                  2       0.2%
6.8583                1       0.1%
6.95                  1       0.1%

Value                 Count   Frequency (%)
512.3292              3       0.3%
263                   4       0.4%
262.375               2       0.2%
247.5208              2       0.2%
227.525               4       0.4%
221.7792              1       0.1%
211.5                 1       0.1%
211.3375              3       0.3%
164.8667              2       0.2%
153.4625              3       0.3%

Pclass
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Value                 Count
3                     491
1                     216
2                     184

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 891
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 1
3rd row: 3
4th row: 1
5th row: 3

Common Values

Value                 Count   Frequency (%)
3                     491     55.1%
1                     216     24.2%
2                     184     20.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value                 Count   Frequency (%)
3                     491     55.1%
1                     216     24.2%
2                     184     20.7%

Most occurring characters

Value                 Count   Frequency (%)
3                     491     55.1%
1                     216     24.2%
2                     184     20.7%

Most occurring categories

Value                 Count   Frequency (%)
Decimal Number        891     100.0%

Most frequent character per category

Decimal Number
Value                 Count   Frequency (%)
3                     491     55.1%
1                     216     24.2%
2                     184     20.7%

Most occurring scripts

Value                 Count   Frequency (%)
Common                891     100.0%

Most frequent character per script

Common
Value                 Count   Frequency (%)
3                     491     55.1%
1                     216     24.2%
2                     184     20.7%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                 891     100.0%

Most frequent character per block

ASCII
Value                 Count   Frequency (%)
3                     491     55.1%
1                     216     24.2%
2                     184     20.7%

Sex
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Value                 Count
0                     577
1                     314

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 891
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value                 Count   Frequency (%)
0                     577     64.8%
1                     314     35.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value                 Count   Frequency (%)
0                     577     64.8%
1                     314     35.2%

Most occurring characters

Value                 Count   Frequency (%)
0                     577     64.8%
1                     314     35.2%

Most occurring categories

Value                 Count   Frequency (%)
Decimal Number        891     100.0%

Most frequent character per category

Decimal Number
Value                 Count   Frequency (%)
0                     577     64.8%
1                     314     35.2%

Most occurring scripts

Value                 Count   Frequency (%)
Common                891     100.0%

Most frequent character per script

Common
Value                 Count   Frequency (%)
0                     577     64.8%
1                     314     35.2%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                 891     100.0%

Most frequent character per block

ASCII
Value                 Count   Frequency (%)
0                     577     64.8%
1                     314     35.2%

Age
Real number (ℝ)

MISSING 

Distinct: 88
Distinct (%): 12.3%
Missing: 177
Missing (%): 19.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.699118
Minimum: 0.42
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0.42
5-th percentile: 4
Q1: 20.125
Median: 28
Q3: 38
95-th percentile: 56
Maximum: 80
Range: 79.58
Interquartile range (IQR): 17.875

Descriptive statistics

Standard deviation: 14.526497
Coefficient of variation (CV): 0.48912219
Kurtosis: 0.17827415
Mean: 29.699118
Median Absolute Deviation (MAD): 9
Skewness: 0.38910778
Sum: 21205.17
Variance: 211.01912
Monotonicity: Not monotonic
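
The Age figures are consistent with statistics computed over the 714 non-missing values only: the mean equals the sum divided by 714, and Distinct (%) equals 88 / 714 rather than 88 / 891. The short check below uses only numbers reported in this section; the variable names are illustrative.

```python
n_observations = 891
n_missing = 177
age_sum = 21205.17
n_distinct = 88

n_present = n_observations - n_missing           # rows with a recorded age
mean_age = age_sum / n_present                   # mean over non-missing rows
distinct_pct = 100 * n_distinct / n_present      # share of non-missing rows
missing_pct = 100 * n_missing / n_observations   # share of all rows

print(n_present)               # 714
print(round(mean_age, 4))
print(round(distinct_pct, 1))  # 12.3
print(round(missing_pct, 1))   # 19.9
```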
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
24                    30      3.4%
22                    27      3.0%
18                    26      2.9%
28                    25      2.8%
30                    25      2.8%
19                    25      2.8%
21                    24      2.7%
25                    23      2.6%
36                    22      2.5%
29                    20      2.2%
Other values (78)     467     52.4%
(Missing)             177     19.9%

Value                 Count   Frequency (%)
0.42                  1       0.1%
0.67                  1       0.1%
0.75                  2       0.2%
0.83                  2       0.2%
0.92                  1       0.1%
1                     7       0.8%
2                     10      1.1%
3                     6       0.7%
4                     10      1.1%
5                     4       0.4%

Value                 Count   Frequency (%)
80                    1       0.1%
74                    1       0.1%
71                    2       0.2%
70.5                  1       0.1%
70                    2       0.2%
66                    1       0.1%
65                    3       0.3%
64                    2       0.2%
63                    2       0.2%
62                    4       0.4%

SibSp
Real number (ℝ)

ZEROS 

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.52300786
Minimum: 0
Maximum: 8
Zeros: 608
Zeros (%): 68.2%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 3
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.1027434
Coefficient of variation (CV): 2.1084644
Kurtosis: 17.88042
Mean: 0.52300786
Median Absolute Deviation (MAD): 0
Skewness: 3.6953517
Sum: 466
Variance: 1.2160431
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value                 Count   Frequency (%)
0                     608     68.2%
1                     209     23.5%
2                     28      3.1%
4                     18      2.0%
3                     16      1.8%
8                     7       0.8%
5                     5       0.6%

Value                 Count   Frequency (%)
0                     608     68.2%
1                     209     23.5%
2                     28      3.1%
3                     16      1.8%
4                     18      2.0%
5                     5       0.6%
8                     7       0.8%

Value                 Count   Frequency (%)
8                     7       0.8%
5                     5       0.6%
4                     18      2.0%
3                     16      1.8%
2                     28      3.1%
1                     209     23.5%
0                     608     68.2%

Parch
Real number (ℝ)

ZEROS 

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.38159371
Minimum: 0
Maximum: 6
Zeros: 678
Zeros (%): 76.1%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.80605722
Coefficient of variation (CV): 2.1123441
Kurtosis: 9.7781252
Mean: 0.38159371
Median Absolute Deviation (MAD): 0
Skewness: 2.749117
Sum: 340
Variance: 0.64972824
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value                 Count   Frequency (%)
0                     678     76.1%
1                     118     13.2%
2                     80      9.0%
5                     5       0.6%
3                     5       0.6%
4                     4       0.4%
6                     1       0.1%

Value                 Count   Frequency (%)
0                     678     76.1%
1                     118     13.2%
2                     80      9.0%
3                     5       0.6%
4                     4       0.4%
5                     5       0.6%
6                     1       0.1%

Value                 Count   Frequency (%)
6                     1       0.1%
5                     5       0.6%
4                     4       0.4%
3                     5       0.6%
2                     80      9.0%
1                     118     13.2%
0                     678     76.1%

Embarked
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 2
Missing (%): 0.2%
Memory size: 7.1 KiB

Value                 Count
1.0                   644
3.0                   168
2.0                   77

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2667
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 3.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value                 Count   Frequency (%)
1.0                   644     72.3%
3.0                   168     18.9%
2.0                   77      8.6%
(Missing)             2       0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value                 Count   Frequency (%)
1.0                   644     72.4%
3.0                   168     18.9%
2.0                   77      8.7%
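
The two Embarked tables report slightly different percentages (72.3% against 72.4% for the value 1.0). The difference is consistent with the Common Values table dividing by all 891 rows and the plot table dividing by the 889 non-missing rows, as the following check on the reported counts shows.

```python
counts = {"1.0": 644, "3.0": 168, "2.0": 77}
n_missing = 2

n_present = sum(counts.values())  # rows with a recorded value
n_total = n_present + n_missing   # all rows

shares = {
    value: (
        round(100 * count / n_total, 1),    # share of all rows
        round(100 * count / n_present, 1),  # share of non-missing rows
    )
    for value, count in counts.items()
}

for value, (of_all, of_present) in shares.items():
    print(value, of_all, of_present)
```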

Most occurring characters

Value                 Count   Frequency (%)
.                     889     33.3%
0                     889     33.3%
1                     644     24.1%
3                     168     6.3%
2                     77      2.9%

Most occurring categories

Value                 Count   Frequency (%)
Decimal Number        1778    66.7%
Other Punctuation     889     33.3%

Most frequent character per category

Decimal Number
Value                 Count   Frequency (%)
0                     889     50.0%
1                     644     36.2%
3                     168     9.4%
2                     77      4.3%

Other Punctuation
Value                 Count   Frequency (%)
.                     889     100.0%

Most occurring scripts

Value                 Count   Frequency (%)
Common                2667    100.0%

Most frequent character per script

Common
Value                 Count   Frequency (%)
.                     889     33.3%
0                     889     33.3%
1                     644     24.1%
3                     168     6.3%
2                     77      2.9%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                 2667    100.0%

Most frequent character per block

ASCII
Value                 Count   Frequency (%)
.                     889     33.3%
0                     889     33.3%
1                     644     24.1%
3                     168     6.3%
2                     77      2.9%

Cabin
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Value                 Count
0                     687
1                     204

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 891
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value                 Count   Frequency (%)
0                     687     77.1%
1                     204     22.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value                 Count   Frequency (%)
0                     687     77.1%
1                     204     22.9%

Most occurring characters

Value                 Count   Frequency (%)
0                     687     77.1%
1                     204     22.9%

Most occurring categories

Value                 Count   Frequency (%)
Decimal Number        891     100.0%

Most frequent character per category

Decimal Number
Value                 Count   Frequency (%)
0                     687     77.1%
1                     204     22.9%

Most occurring scripts

Value                 Count   Frequency (%)
Common                891     100.0%

Most frequent character per script

Common
Value                 Count   Frequency (%)
0                     687     77.1%
1                     204     22.9%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                 891     100.0%

Most frequent character per block

ASCII
Value                 Count   Frequency (%)
0                     687     77.1%
1                     204     22.9%

Fare.1
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 248
Distinct (%): 27.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.204208
Minimum: 0
Maximum: 512.3292
Zeros: 15
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 7.225
Q1: 7.9104
Median: 14.4542
Q3: 31
95-th percentile: 112.07915
Maximum: 512.3292
Range: 512.3292
Interquartile range (IQR): 23.0896

Descriptive statistics

Standard deviation: 49.693429
Coefficient of variation (CV): 1.5430725
Kurtosis: 33.398141
Mean: 32.204208
Median Absolute Deviation (MAD): 6.9042
Skewness: 4.7873165
Sum: 28693.949
Variance: 2469.4368
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
8.05                  43      4.8%
13                    42      4.7%
7.8958                38      4.3%
7.75                  34      3.8%
26                    31      3.5%
10.5                  24      2.7%
7.925                 18      2.0%
7.775                 16      1.8%
7.2292                15      1.7%
0                     15      1.7%
Other values (238)    615     69.0%

Value                 Count   Frequency (%)
0                     15      1.7%
4.0125                1       0.1%
5                     1       0.1%
6.2375                1       0.1%
6.4375                1       0.1%
6.45                  1       0.1%
6.4958                2       0.2%
6.75                  2       0.2%
6.8583                1       0.1%
6.95                  1       0.1%

Value                 Count   Frequency (%)
512.3292              3       0.3%
263                   4       0.4%
262.375               2       0.2%
247.5208              2       0.2%
227.525               4       0.4%
221.7792              1       0.1%
211.5                 1       0.1%
211.3375              3       0.3%
164.8667              2       0.2%
153.4625              3       0.3%

Interactions

[Figures: 36 pairwise interaction plots]

Correlations

             Age     Cabin   Embarked  Fare    Fare.1  Parch   PassengerId  Pclass  Sex     SibSp
Age          1.000   0.258   0.065     0.135   0.135   -0.254  0.041        0.269   0.099   -0.182
Cabin        0.258   1.000   0.228     0.539   0.539   0.080   0.020        0.790   0.134   0.052
Embarked     0.065   0.228   1.000     0.077   0.077   -0.029  -0.018       0.260   0.113   -0.012
Fare         0.135   0.539   0.077     1.000   1.000   0.410   -0.014       0.479   0.189   0.447
Fare.1       0.135   0.539   0.077     1.000   1.000   0.410   -0.014       0.479   0.189   0.447
Parch        -0.254  0.080   -0.029    0.410   0.410   1.000   0.001        0.022   0.247   0.450
PassengerId  0.041   0.020   -0.018    -0.014  -0.014  0.001   1.000        0.032   0.066   -0.061
Pclass       0.269   0.790   0.260     0.479   0.479   0.022   0.032        1.000   0.130   -0.043
Sex          0.099   0.134   0.113     0.189   0.189   0.247   0.066        0.130   1.000   0.195
SibSp        -0.182  0.052   -0.012    0.447   0.447   0.450   -0.061       -0.043  0.195   1.000
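
The High correlation alerts in the Overview can be recovered from this matrix by listing, for each variable, the other variables whose absolute coefficient reaches a cut-off. The 0.5 threshold below is an assumption chosen because it reproduces the alerts; the report itself does not state the value used.

```python
names = ["Age", "Cabin", "Embarked", "Fare", "Fare.1",
         "Parch", "PassengerId", "Pclass", "Sex", "SibSp"]

# Correlation matrix as reported above, rows and columns in the order of `names`.
matrix = [
    [1.000, 0.258, 0.065, 0.135, 0.135, -0.254, 0.041, 0.269, 0.099, -0.182],
    [0.258, 1.000, 0.228, 0.539, 0.539, 0.080, 0.020, 0.790, 0.134, 0.052],
    [0.065, 0.228, 1.000, 0.077, 0.077, -0.029, -0.018, 0.260, 0.113, -0.012],
    [0.135, 0.539, 0.077, 1.000, 1.000, 0.410, -0.014, 0.479, 0.189, 0.447],
    [0.135, 0.539, 0.077, 1.000, 1.000, 0.410, -0.014, 0.479, 0.189, 0.447],
    [-0.254, 0.080, -0.029, 0.410, 0.410, 1.000, 0.001, 0.022, 0.247, 0.450],
    [0.041, 0.020, -0.018, -0.014, -0.014, 0.001, 1.000, 0.032, 0.066, -0.061],
    [0.269, 0.790, 0.260, 0.479, 0.479, 0.022, 0.032, 1.000, 0.130, -0.043],
    [0.099, 0.134, 0.113, 0.189, 0.189, 0.247, 0.066, 0.130, 1.000, 0.195],
    [-0.182, 0.052, -0.012, 0.447, 0.447, 0.450, -0.061, -0.043, 0.195, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not stated in the report

flagged = {}
for i, a in enumerate(names):
    partners = [
        b for j, b in enumerate(names)
        if i != j and abs(matrix[i][j]) >= THRESHOLD
    ]
    if partners:
        flagged[a] = partners

for name, partners in flagged.items():
    print(name, "->", ", ".join(partners))
```

With this cut-off, Cabin is paired with Fare, Fare.1 and Pclass, which matches the "Fare and 2 other fields" wording of the alert.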

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

     PassengerId  Name                                                 Fare     Pclass  Sex  Age   SibSp  Parch  Embarked  Cabin  Fare.1
0    1            Braund, Mr. Owen Harris                              7.2500   3       0    22.0  1      0      1.0       0      7.2500
1    2            Cumings, Mrs. John Bradley (Florence Briggs Thayer)  71.2833  1       1    38.0  1      0      3.0       1      71.2833
2    3            Heikkinen, Miss. Laina                               7.9250   3       1    26.0  0      0      1.0       0      7.9250
3    4            Futrelle, Mrs. Jacques Heath (Lily May Peel)         53.1000  1       1    35.0  1      0      1.0       1      53.1000
4    5            Allen, Mr. William Henry                             8.0500   3       0    35.0  0      0      1.0       0      8.0500
5    6            Moran, Mr. James                                     8.4583   3       0    NaN   0      0      2.0       0      8.4583
6    7            McCarthy, Mr. Timothy J                              51.8625  1       0    54.0  0      0      1.0       1      51.8625
7    8            Palsson, Master. Gosta Leonard                       21.0750  3       0    2.0   3      1      1.0       0      21.0750
8    9            Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)    11.1333  3       1    27.0  0      2      1.0       0      11.1333
9    10           Nasser, Mrs. Nicholas (Adele Achem)                  30.0708  2       1    14.0  1      0      3.0       0      30.0708

Last rows

     PassengerId  Name                                                 Fare     Pclass  Sex  Age   SibSp  Parch  Embarked  Cabin  Fare.1
881  882          Markun, Mr. Johann                                   7.8958   3       0    33.0  0      0      1.0       0      7.8958
882  883          Dahlberg, Miss. Gerda Ulrika                         10.5167  3       1    22.0  0      0      1.0       0      10.5167
883  884          Banfield, Mr. Frederick James                        10.5000  2       0    28.0  0      0      1.0       0      10.5000
884  885          Sutehall, Mr. Henry Jr                               7.0500   3       0    25.0  0      0      1.0       0      7.0500
885  886          Rice, Mrs. William (Margaret Norton)                 29.1250  3       1    39.0  0      5      2.0       0      29.1250
886  887          Montvila, Rev. Juozas                                13.0000  2       0    27.0  0      0      1.0       0      13.0000
887  888          Graham, Miss. Margaret Edith                         30.0000  1       1    19.0  0      0      1.0       1      30.0000
888  889          Johnston, Miss. Catherine Helen "Carrie"             23.4500  3       1    NaN   1      2      1.0       0      23.4500
889  890          Behr, Mr. Karl Howell                                30.0000  1       0    26.0  0      0      3.0       1      30.0000
890  891          Dooley, Mr. Patrick                                  7.7500   3       0    32.0  0      0      2.0       0      7.7500